Archive for the ‘research’ Category

Notes wiki

Sunday, October 18th, 2009

I started using a gitit wiki to maintain my notes on CS research and programming. It’s available here. For a while, I’ve kept my notes in a bunch of “loosely pandoc” files, so gitit was an easy way to wiki-fy everything: to make the notes easier to view and edit from any browser, and to share them publicly.

At the moment my notes are highly disorganized and probably have many formatting bugs. Also, they typically don’t include topics from classes I’ve taken. I’m hoping this will become a not-too-cryptic way for me to share more information with less effort than writing blog posts.

GRE e-rater

Tuesday, June 2nd, 2009

It turns out that the ETS has a whole research division, including researchers in natural language processing who come up with stuff like e-rater and other machine essay graders, and that they publish about these systems.

According to the latest system description paper:

The feature set used with e-rater V.2 include measures of grammar, usage, mechanics, style, organization, development, lexical complexity, and prompt-specific vocabulary usage.

E-rater is part of Criterion, a web-based service that provides students with instant scoring and feedback on their submitted essays. Criterion has a number of writing analysis tools whose outputs form the feature vector used by e-rater. The score is a simple weighted average of the feature values.
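To make the "simple weighted average" concrete, here is a minimal sketch in Python. It is not e-rater's actual implementation: the function name `weighted_score`, the feature values, and the weights are all made up for illustration, and only the feature names are borrowed from the list quoted above.

```python
def weighted_score(features, weights):
    """Return the weighted average of the feature values.

    `features` and `weights` are dicts keyed by feature name.
    Both are hypothetical stand-ins, not e-rater's real inputs.
    """
    total_weight = sum(weights[name] for name in features)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(weights[name] * value for name, value in features.items())
    return weighted_sum / total_weight


# Illustrative values only; the real feature scales and weights are not given here.
essay_features = {"grammar": 0.8, "organization": 0.6, "lexical complexity": 0.7}
feature_weights = {"grammar": 2.0, "organization": 1.0, "lexical complexity": 1.0}

print(weighted_score(essay_features, feature_weights))
```

In a model of this shape, every feature's contribution to the score can be read directly off its weight.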

One noteworthy detail is that, in determining the parameters for this model, e-rater eschews exclusively statistical machine learning (optimization) approaches in favor of allowing judgmental control. The reasons given are twofold: control (to avoid unintentional skew and other undesirable statistical effects) and transparency (to make the system easier to understand and explain).

It would be interesting to see how straightforward it is to game e-rater, given the above information and access to the implementation in Criterion.